NSF PAR Search | NSF Public Access Repository

HiPerMotif: Novel Parallel Subgraph Isomorphism in Large-Scale Property Graphs

https://doi.org/10.1109/HPEC67600.2025.11196090

Dindoost, Mohammad; Rodriguez, Oliver Alvarado; Bryg, Bartosz; Koutis, Ioannis; Bader, David A (September 2025, IEEE)

Subgraph isomorphism algorithms face significant scalability bottlenecks on large-scale property graphs due to inefficient vertex-by-vertex search that requires extensive exploration of early search tree levels where pruning is minimal. We present HiPerMotif, a hybrid parallel algorithm that overcomes these limitations through edge-centric initialization. HiPerMotif first reorders pattern graphs to prioritize high-connectivity vertices, then systematically identifies and validates all possible first-edge mappings before injecting these pre-validated partial states directly at search depth 2. This approach eliminates costly early exploration while enabling natural parallelization over independent edge candidates. Comprehensive evaluation against state-of-the-art baselines (VF2-PS, VF3P, Glasgow) demonstrates up to 66x speedup on real-world networks and successful processing of massive datasets like the 150M-edge H01 human connectome that cause existing methods to fail due to memory constraints. Implemented in the open-source Arkouda/Arachne framework, HiPerMotif enables previously intractable large-scale network analysis in computational neuroscience and related domains.
Free, publicly-accessible full text available September 15, 2026
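The edge-centric initialization described in the HiPerMotif abstract can be illustrated with a minimal sketch. This is not the authors' Arkouda/Arachne implementation: the function name `find_matches`, the degree-based ordering heuristic, the restriction to vertex labels, the no-self-loop assumption, and the non-induced (monomorphism) matching semantics are all assumptions made for illustration. The sketch enumerates every data edge that can host the first pattern edge, validates it, and starts backtracking from those depth-2 partial states.

```python
from collections import defaultdict


def find_matches(pattern_edges, pattern_labels, data_edges, data_labels):
    """Label-preserving monomorphisms of a directed pattern into a directed graph.

    Hypothetical helper for illustration; assumes no self-loops in the pattern
    and that every pattern vertex appears in pattern_labels.
    """
    p_edges, d_edges = set(pattern_edges), set(data_edges)

    # Order the pattern so that high-degree vertices come early (assumed heuristic).
    degree = defaultdict(int)
    for a, b in p_edges:
        degree[a] += 1
        degree[b] += 1
    first = max(sorted(p_edges), key=lambda e: degree[e[0]] + degree[e[1]])
    rest = sorted((u for u in pattern_labels if u not in first),
                  key=lambda u: -degree[u])
    order = list(first) + rest

    def consistent(mapping, u, v):
        # v must be unused, carry u's label, and preserve every pattern edge
        # between u and the pattern vertices mapped so far.
        if v in mapping.values() or data_labels[v] != pattern_labels[u]:
            return False
        for w, x in mapping.items():
            if (u, w) in p_edges and (v, x) not in d_edges:
                return False
            if (w, u) in p_edges and (x, v) not in d_edges:
                return False
        return True

    # Edge-centric seeding: each validated data edge becomes a depth-2 state.
    seeds = []
    for v0, v1 in sorted(d_edges):
        state = {}
        if consistent(state, first[0], v0):
            state[first[0]] = v0
            if consistent(state, first[1], v1):
                state[first[1]] = v1
                seeds.append(state)

    results = []

    def extend(mapping, depth):
        if depth == len(order):
            results.append(dict(mapping))
            return
        u = order[depth]
        for v in data_labels:
            if consistent(mapping, u, v):
                mapping[u] = v
                extend(mapping, depth + 1)
                del mapping[u]

    # Seeds are independent of one another, so this loop is the natural
    # place to distribute work across parallel workers.
    for seed in seeds:
        extend(seed, 2)
    return results
```

Because each seed carries its own partial mapping, the final loop shares no mutable state between seeds other than the result list, which is what makes parallelizing over edge candidates straightforward in principle.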
Prompt Wrangling: On Replication and Generalization in Large Language Models for PCG Levels

https://doi.org/10.1145/3649921.3659853

Moradi_Karkaj, Arash; Nelson, Mark J; Koutis, Ioannis; Hoover, Amy K (May 2024, ACM)

The ChatGPT4PCG competition calls for participants to submit inputs to ChatGPT or prompts that guide its output toward instructions to generate levels as sequences of Tetris-like block drops. Prompts submitted to the competition are used to query ChatGPT to generate levels that resemble letters of the English alphabet. Levels are evaluated based on their similarity to the target letter and physical stability in the game engine. This provides a quantitative evaluation setting for prompt-based procedural content generation (PCG), an approach that has been gaining popularity in PCG, as in other areas of generative AI. This paper focuses on replicating and generalizing the competition results. The replication experiments in the paper first aim to test whether the number of responses gathered from ChatGPT is sufficient to account for the stochasticity. We requery the original prompt submissions and rerun the original scripts from the competition, on different machines, about six months after the competition. We find that results largely replicate, except that two of the 15 submissions do much better in our replication, for reasons we can only partly determine. When it comes to generalization, we notice that the top-performing prompt has instructions for all 26 target levels hardcoded, which is at odds with the PCGML goal of generating new, previously unseen content from examples. We perform experiments in more restricted zero-shot and few-shot prompting scenarios, and find that generalization remains a challenge for current approaches.
Full Text Available
A generalized Cheeger inequality

https://doi.org/10.1016/j.laa.2023.01.014

Koutis, Ioannis; Miller, Gary; Peng, Richard (May 2023, Linear Algebra and its Applications)

Full Text Available
SpecPart: A Supervised Spectral Framework for Hypergraph Partitioning Solution Improvement

https://doi.org/10.1145/3508352.3549390

Bustany, Ismail; Kahng, Andrew B.; Koutis, Ioannis; Pramanik, Bodhisatta; Wang, Zhiang (October 2022, Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design)

State-of-the-art hypergraph partitioners follow the multilevel paradigm that constructs multiple levels of progressively coarser hypergraphs that are used to drive cut refinements on each level of the hierarchy. Multilevel partitioners are subject to two limitations: (i) Hypergraph coarsening processes rely on local neighborhood structure without fully considering the global structure of the hypergraph. (ii) Refinement heuristics can stagnate on local minima. In this paper, we describe SpecPart, the first supervised spectral framework that directly tackles these two limitations. SpecPart solves a generalized eigenvalue problem that captures the balanced partitioning objective and global hypergraph structure in a low-dimensional vertex embedding while leveraging initial high-quality solutions from multilevel partitioners as hints. SpecPart further constructs a family of trees from the vertex embedding and partitions them with a tree-sweeping algorithm. Then, a novel overlay of multiple tree-based partitioning solutions is followed by lifting to a coarsened hypergraph, where an ILP partitioning instance is solved to alleviate local stagnation. We have validated SpecPart on multiple sets of benchmarks. Experimental results show that for some benchmarks, SpecPart can substantially improve the cutsize by more than 50% with respect to the best published solutions obtained with leading partitioners hMETIS and KaHyPar.
Full Text Available
Ensemble Learning as a Peer Process

Beikihassan, Ehsan; Hoover, Amy; Koutis, Ioannis; Parviz, Ali (April 2022, Agent Learning in Open-Endedness (ALOE) Workshop. ICLR 2022)

Ensemble learning, in its simplest form, entails the training of multiple models with the same training set. In a standard supervised setting, the training set can be viewed as a 'teacher' with an unbounded capacity of interactions with a single group of 'trainee' models. One can then ask the following broad question: How can we train an ensemble if the teacher has a bounded capacity of interactions with the trainees? Towards answering this question we consider how humans learn in peer groups. The problem of how to group individuals in order to maximize outcomes via cooperative learning has been debated for a long time by social scientists and policymakers. More recently, it has attracted research attention from an algorithmic standpoint, which led to the design of grouping policies that appear to result in better aggregate learning in experiments with human subjects. Inspired by human peer learning, we hypothesize that using partially trained models as teachers to other less accurate models, i.e., viewing ensemble learning as a peer process, can provide a solution to our central question. We further hypothesize that grouping policies that match trainer models with learner models play a significant role in the overall learning outcome of the ensemble. We present a formalization and, through extensive experiments with different types of classifiers, we demonstrate that: (i) an ensemble can reach surprising levels of performance with little interaction with the training set, and (ii) grouping policies definitely have an impact on the ensemble performance, in agreement with previous intuition and observations in human peer learning.
Full Text Available
A Novel Calibration Step in Gene Co-Expression Network Construction

https://doi.org/10.3389/fbinf.2021.704817

Aghaieabiane, Niloofar; Koutis, Ioannis (November 2021, Frontiers in Bioinformatics)

High-throughput technologies such as DNA microarrays and RNA-sequencing are used to measure the expression levels of large numbers of genes simultaneously. To support the extraction of biological knowledge, individual gene expression levels are transformed to Gene Co-expression Networks (GCNs). In a GCN, nodes correspond to genes, and the weight of the connection between two nodes is a measure of similarity in the expression behavior of the two genes. In general, GCN construction and analysis includes three steps: 1) calculating a similarity value for each pair of genes; 2) using these similarity values to construct a fully connected weighted network; 3) finding clusters of genes in the network, commonly called modules. The specific implementation of these three steps can significantly impact the final output and the downstream biological analysis. GCN construction is a well-studied topic. Existing algorithms rely on relatively simple statistical and mathematical tools to implement these steps. Currently, the software package WGCNA appears to be the most widely accepted standard. We hypothesize that the raw features provided by sequencing data can be leveraged to extract modules of higher quality. A novel preprocessing step of the gene expression data set is introduced that in effect calibrates the expression levels of individual genes, before computing pairwise similarities. Further, the similarity is computed as an inner-product of positive vectors. In experiments, this provides a significant improvement over WGCNA, as measured by aggregate p-values of the gene ontology term enrichment of the computed modules.
Full Text Available
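The three generic GCN steps listed in the abstract above (pairwise similarity, fully connected weighted network, module finding) can be sketched as follows. This is only an illustration under stated assumptions: absolute Pearson correlation stands in for the similarity measure, and thresholded connected components stand in for module detection. It reproduces neither the calibration step proposed in the paper nor WGCNA's procedure, and the names `pearson`, `build_network`, and `find_modules` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def build_network(expr):
    """Steps 1 and 2: a similarity weight for every pair of genes."""
    return {
        (g, h): abs(pearson(expr[g], expr[h]))
        for g, h in combinations(sorted(expr), 2)
    }


def find_modules(expr, weights, threshold):
    """Step 3 (assumed stand-in): components of the graph of strong edges."""
    parent = {g: g for g in expr}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (g, h), w in weights.items():
        if w >= threshold:
            parent[find(g)] = find(h)

    groups = {}
    for g in expr:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Toy expression profiles (made-up numbers, four samples per gene).
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 1, 3, 2],
    "g4": [8, 2, 6.5, 4],
}
weights = build_network(expr)
modules = find_modules(expr, weights, threshold=0.9)
```

The choice of similarity function and of the clustering rule is exactly where, according to the abstract, implementations differ and where the final modules are most affected.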
Spectral Hypergraph Partitioning Revisited

Pramanik, Bodhisatta; Koutis, Ioannis (July 2021, SIAM Conference on Applied and Computational Discrete Algorithms)

Most state-of-the-art hypergraph partitioning algorithms follow a multilevel approach that constructs a hierarchy of coarser hypergraphs that in turn is used to drive partition refinements. These partitioners are widely accepted as the current standard, as they have proven to be quite effective. On the other hand, spectral partitioners are considered to be less effective in cut quality, and too slow to be used in industrial applications. In this work, we revisit spectral hypergraph partitioning and we demonstrate that the use of appropriate solvers eliminates the running time deficiency; in fact, spectral algorithms can compute competing solutions in a fraction of the time needed by standard partitioning algorithms, especially on larger designs. We also introduce several novel modifications in the common spectral partitioning workflow that significantly enhance the quality of the computed solutions. We run our partitioner on FPGA benchmarks generated by an industry leader, generating solutions that are directly competitive both in runtime and quality.
Full Text Available
Peer Learning Through Targeted Dynamic Groups Formation

https://doi.org/10.1109/ICDE51399.2021.00018

Wei, Dong; Koutis, Ioannis; Roy, Senjuti Basu (April 2021, 2021 IEEE 37th International Conference on Data Engineering (ICDE))

Peer groups leverage the presence of knowledgeable individuals in order to increase the knowledge level of other participants. The 'smart' formation of peer groups can thus play a crucial role in educational settings, including online social networks and learning platforms. Indeed, the targeted groups formation problem, where the objective is to maximize a measure of aggregate knowledge, has received considerable attention in recent literature. In this paper we initiate the study of a dynamic variant of the problem that, unlike previous works, allows the change of group composition over time while still aiming to maximize the aggregated knowledge level. The problem is studied in a principled way, using a realistic learning gain function and for two different interaction modes among the group members. On the algorithmic side, we present DyGroups, a generic algorithmic framework that is greedy in nature and highly scalable. We present non-trivial proofs to demonstrate theoretical guarantees for DyGroups in a special case. We also present real peer learning experiments with humans, and perform synthetic data experiments to demonstrate the effectiveness of our proposed solutions by comparing against multiple appropriately selected baseline algorithms.
Full Text Available
Peer Learning Through Targeted Dynamic Groups Formation

Wei, Dong; Koutis, Ioannis; Basu Roy, Senjuti (April 2021, International Conference on Data Engineering)
Full Text Available